Switch DirectX Target to use the Itanium ABI #111632

Open
wants to merge 1 commit into
base: main

@pow2clk pow2clk commented Oct 9, 2024

To consolidate function-mangling behavior and limit the number of places where ABI changes need to be made, this switches the DirectX target used for HLSL from the Microsoft ABI to the Itanium ABI. The Itanium ABI offers greater flexibility in deciding how to mangle new types, of which we have more than a few yet to add.

One effect is that library shaders compiled with DXC can no longer be linked with shaders compiled with clang. That isn't considered a terribly interesting use case, and it would likely have been onerous to maintain anyway.

This involved adding a function to call all global destructors, as the Microsoft ABI had done, because HLSL doesn't support atexit.
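
As a rough illustration of that approach (not the clang implementation), the sketch below records destructor entries in a list and runs them from one cleanup function at the end of the entry point instead of relying on atexit. The names `add_dtor_entry`, `run_global_dtors`, and `shader_entry` are hypothetical. Running the entries in reverse registration order is an assumption, chosen to match the Tail-then-Pupper order that GlobalDestructors.hlsl checks.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical (destructor, object) pair standing in for one registered entry.
using DtorFn = void (*)(void *);
struct DtorEntry {
  DtorFn fn;
  void *object;
};

static std::vector<DtorEntry> g_dtor_entries; // filled as globals are constructed
static std::vector<std::string> g_log;        // records the destruction order

// Record a destructor call instead of handing it to atexit.
void add_dtor_entry(DtorFn fn, void *object) {
  assert(fn != nullptr);
  g_dtor_entries.push_back({fn, object});
}

// Single cleanup function: run every recorded destructor, newest first.
void run_global_dtors() {
  for (auto it = g_dtor_entries.rbegin(); it != g_dtor_entries.rend(); ++it)
    it->fn(it->object);
  g_dtor_entries.clear();
}

struct Pupper {};
struct Tail {};
void destroy_pupper(void *) { g_log.push_back("~Pupper"); }
void destroy_tail(void *) { g_log.push_back("~Tail"); }

static Pupper g_global_pup;
static Tail g_tail;

// Hypothetical entry-point wrapper: register, run the body, then clean up.
void shader_entry() {
  add_dtor_entry(destroy_pupper, &g_global_pup); // constructed first
  add_dtor_entry(destroy_tail, &g_tail);         // constructed second
  // ... the shader body would run here ...
  run_global_dtors();                            // logs ~Tail, then ~Pupper
}
```

Calling `shader_entry()` once leaves `g_log` holding `~Tail` followed by `~Pupper`, which is the shape of the `_GLOBAL__D_a` body in the updated test.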

This requires a few changes to tests. Most notably, the mangling style has changed, which accounts for most of them. In making those changes, I took the opportunity to harmonize some very similar tests for greater consistency. I also shaved off some unneeded run flags that had probably been copied over from one test to another.
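
For readers less familiar with the new names in the CHECK lines, here is a minimal sketch of how the Itanium scheme spells the simplest case: a free function at global scope with only builtin parameters. It covers a small subset of the type codes and ignores namespaces, templates, and substitutions. `mangle_free_function` and the `Builtin` enum are hypothetical helpers, not clang APIs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Itanium type codes for a few builtin types (a small subset).
enum class Builtin : char { Void = 'v', Int = 'i', UInt = 'j', Float = 'f' };

// Mangle a global-scope free function: "_Z" + <name length> + <name> + <params>.
// An empty parameter list is spelled as a single 'v'.
std::string mangle_free_function(const std::string &name,
                                 const std::vector<Builtin> &params) {
  assert(!name.empty());
  std::string out = "_Z" + std::to_string(name.size()) + name;
  if (params.empty())
    out += static_cast<char>(Builtin::Void);
  for (Builtin p : params)
    out += static_cast<char>(p);
  return out;
}
```

With this, `mangle_free_function("main", {Builtin::UInt})` yields `_Z4mainj` and `mangle_free_function("call_me_first", {})` yields `_Z13call_me_firstv`, matching the names in the updated tests.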

Other changes effected by using the new ABI include using different types when manipulating smaller bitfields, eliminating an unnecessary alloca in one instance in this-assignment.hlsl, changing the way static local initialization is guarded, and changing the order in which inout parameters are copied in and out. The last is a subtle change in functionality. There was enough inconsistency in the past that standardizing is important, but the particular direction of the standardization matters less for existing shaders.

fixes #110736

To consolidate function-mangling behavior and limit the number of
places where ABI changes need to be made, this switches the DirectX
target used for HLSL from the Microsoft ABI to the Itanium ABI. The
Itanium ABI offers greater flexibility in deciding how to mangle new
types, of which we have more than a few yet to add.

This required adding a function to call all global destructors, as
the Microsoft ABI had done.

This requires a few changes to tests. Most notably, the mangling
style has changed, which accounts for most of them. In making those
changes, I took the opportunity to harmonize some very similar tests
for greater consistency. I also shaved off some unneeded run flags
that had probably been copied over from one test to another.

Other changes effected by using the new ABI include using smaller
types when possible in a few instances, eliminating an unnecessary
alloca in one instance in this-assignment.hlsl, and changing the
order in which inout parameters are copied in and out. The last is a
subtle change in functionality. There was enough inconsistency in
the past that standardizing is important, but the particular
direction of the standardization matters less for existing shaders.

fixes llvm#110736
@llvmbot llvmbot added the clang, clang:frontend, clang:codegen, backend:DirectX, and HLSL labels Oct 9, 2024

llvmbot commented Oct 9, 2024

@llvm/pr-subscribers-backend-directx
@llvm/pr-subscribers-hlsl
@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Greg Roth (pow2clk)


Patch is 142.96 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/111632.diff

48 Files Affected:

  • (modified) clang/lib/Basic/Targets/DirectX.h (+1-1)
  • (modified) clang/lib/CodeGen/ItaniumCXXABI.cpp (+4)
  • (modified) clang/test/CodeGenHLSL/ArrayTemporary.hlsl (+4-4)
  • (modified) clang/test/CodeGenHLSL/BasicFeatures/OutputArguments.hlsl (+14-12)
  • (modified) clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl (+4-4)
  • (modified) clang/test/CodeGenHLSL/GlobalConstructorLib.hlsl (+8-4)
  • (modified) clang/test/CodeGenHLSL/GlobalConstructors.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/GlobalDestructors.hlsl (+7-7)
  • (modified) clang/test/CodeGenHLSL/basic_types.hlsl (+32-32)
  • (modified) clang/test/CodeGenHLSL/builtins/RWBuffer-annotations.hlsl (+6-6)
  • (modified) clang/test/CodeGenHLSL/builtins/RWBuffer-elementtype.hlsl (+13-13)
  • (modified) clang/test/CodeGenHLSL/builtins/RasterizerOrderedBuffer-annotations.hlsl (+6-6)
  • (modified) clang/test/CodeGenHLSL/builtins/StructuredBuffer-annotations.hlsl (+6-6)
  • (modified) clang/test/CodeGenHLSL/builtins/StructuredBuffer-elementtype.hlsl (+13-13)
  • (modified) clang/test/CodeGenHLSL/builtins/abs.hlsl (+38-35)
  • (modified) clang/test/CodeGenHLSL/builtins/ceil.hlsl (+18-19)
  • (modified) clang/test/CodeGenHLSL/builtins/clamp.hlsl (+50-51)
  • (modified) clang/test/CodeGenHLSL/builtins/cos.hlsl (+18-19)
  • (modified) clang/test/CodeGenHLSL/builtins/exp.hlsl (+18-19)
  • (modified) clang/test/CodeGenHLSL/builtins/exp2.hlsl (+18-19)
  • (modified) clang/test/CodeGenHLSL/builtins/floor.hlsl (+18-19)
  • (modified) clang/test/CodeGenHLSL/builtins/hlsl_resource_t.hlsl (+2-2)
  • (modified) clang/test/CodeGenHLSL/builtins/log.hlsl (+18-19)
  • (modified) clang/test/CodeGenHLSL/builtins/log10.hlsl (+18-19)
  • (modified) clang/test/CodeGenHLSL/builtins/log2.hlsl (+18-19)
  • (modified) clang/test/CodeGenHLSL/builtins/max.hlsl (+50-51)
  • (modified) clang/test/CodeGenHLSL/builtins/min.hlsl (+50-51)
  • (modified) clang/test/CodeGenHLSL/builtins/pow.hlsl (+18-19)
  • (modified) clang/test/CodeGenHLSL/builtins/round.hlsl (+18-19)
  • (modified) clang/test/CodeGenHLSL/builtins/saturate.hlsl (+44-79)
  • (modified) clang/test/CodeGenHLSL/builtins/sin.hlsl (+18-19)
  • (modified) clang/test/CodeGenHLSL/builtins/sqrt.hlsl (+18-19)
  • (modified) clang/test/CodeGenHLSL/builtins/trunc.hlsl (+19-20)
  • (modified) clang/test/CodeGenHLSL/export.hlsl (+5-6)
  • (modified) clang/test/CodeGenHLSL/float3.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/group_shared.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/half.hlsl (+2-2)
  • (modified) clang/test/CodeGenHLSL/implicit-norecurse-attrib.hlsl (+4-4)
  • (modified) clang/test/CodeGenHLSL/inline-constructors.hlsl (+2-2)
  • (modified) clang/test/CodeGenHLSL/inline-functions.hlsl (+5-5)
  • (modified) clang/test/CodeGenHLSL/semantics/GroupIndex-codegen.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/shift-mask.hlsl (+37-6)
  • (modified) clang/test/CodeGenHLSL/sret_output.hlsl (+3-4)
  • (modified) clang/test/CodeGenHLSL/static-local-ctor.hlsl (+7-7)
  • (modified) clang/test/CodeGenHLSL/static_global_and_function_in_cb.hlsl (+3-4)
  • (modified) clang/test/CodeGenHLSL/this-assignment-overload.hlsl (+4-4)
  • (modified) clang/test/CodeGenHLSL/this-assignment.hlsl (+2-5)
  • (modified) clang/test/CodeGenHLSL/this-reference.hlsl (+2-2)
diff --git a/clang/lib/Basic/Targets/DirectX.h b/clang/lib/Basic/Targets/DirectX.h
index cf7ea5e83503dc..19b61252409b09 100644
--- a/clang/lib/Basic/Targets/DirectX.h
+++ b/clang/lib/Basic/Targets/DirectX.h
@@ -62,7 +62,7 @@ class LLVM_LIBRARY_VISIBILITY DirectXTargetInfo : public TargetInfo {
     PlatformName = llvm::Triple::getOSTypeName(Triple.getOS());
     resetDataLayout("e-m:e-p:32:32-i1:32-i8:8-i16:16-i32:32-i64:64-f16:16-f32:"
                     "32-f64:64-n8:16:32:64");
-    TheCXXABI.set(TargetCXXABI::Microsoft);
+    TheCXXABI.set(TargetCXXABI::GenericItanium);
   }
   bool useFP16ConversionIntrinsics() const override { return false; }
   void getTargetDefines(const LangOptions &Opts,
diff --git a/clang/lib/CodeGen/ItaniumCXXABI.cpp b/clang/lib/CodeGen/ItaniumCXXABI.cpp
index 965e09a7a760ec..75dab596e1b2c4 100644
--- a/clang/lib/CodeGen/ItaniumCXXABI.cpp
+++ b/clang/lib/CodeGen/ItaniumCXXABI.cpp
@@ -2997,6 +2997,10 @@ void ItaniumCXXABI::registerGlobalDtor(CodeGenFunction &CGF, const VarDecl &D,
   if (D.isNoDestroy(CGM.getContext()))
     return;
 
+  // HLSL doesn't support atexit.
+  if (CGM.getLangOpts().HLSL)
+    return CGM.AddCXXDtorEntry(dtor, addr);
+
   // OpenMP offloading supports C++ constructors and destructors but we do not
   // always have 'atexit' available. Instead lower these to use the LLVM global
   // destructors which we can handle directly in the runtime. Note that this is
diff --git a/clang/test/CodeGenHLSL/ArrayTemporary.hlsl b/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
index 63a30b61440eb5..7d77c0aff736cc 100644
--- a/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
+++ b/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
@@ -68,11 +68,11 @@ void call4(float Arr[2][2]) {
 // CHECK: [[Tmp2:%.*]] = alloca [4 x float]
 // CHECK: [[Tmp3:%.*]] = alloca [3 x i32]
 // CHECK: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[Tmp1]], ptr align 4 [[FA2]], i32 8, i1 false)
-// CHECK: call void @"??$template_fn@$$BY01M@@YAXY01M@Z"(ptr noundef byval([2 x float]) align 4 [[Tmp1]])
+// CHECK: call void @_Z11template_fnIA2_fEvT_(ptr noundef byval([2 x float]) align 4 [[Tmp1]])
 // CHECK: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[Tmp2]], ptr align 4 [[FA4]], i32 16, i1 false)
-// CHECK: call void @"??$template_fn@$$BY03M@@YAXY03M@Z"(ptr noundef byval([4 x float]) align 4 [[Tmp2]])
+// CHECK: call void @_Z11template_fnIA4_fEvT_(ptr noundef byval([4 x float]) align 4 [[Tmp2]])
 // CHECK: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[Tmp3]], ptr align 4 [[IA3]], i32 12, i1 false)
-// CHECK: call void @"??$template_fn@$$BY02H@@YAXY02H@Z"(ptr noundef byval([3 x i32]) align 4 [[Tmp3]])
+// CHECK: call void @_Z11template_fnIA3_iEvT_(ptr noundef byval([3 x i32]) align 4 [[Tmp3]])
 
 template<typename T>
 void template_fn(T Val) {}
@@ -90,7 +90,7 @@ void template_call(float FA2[2], float FA4[4], int IA3[3]) {
 
 // CHECK: [[Addr:%.*]] = getelementptr inbounds [2 x float], ptr [[FA2]], i32 0, i32 0
 // CHECK: [[Tmp:%.*]] = load float, ptr [[Addr]]
-// CHECK: call void @"??$template_fn@M@@YAXM@Z"(float noundef [[Tmp]])
+// CHECK: call void @_Z11template_fnIfEvT_(float noundef [[Tmp]])
 
 // CHECK: [[Idx0:%.*]] = getelementptr inbounds [2 x float], ptr [[FA2]], i32 0, i32 0
 // CHECK: [[Val0:%.*]] = load float, ptr [[Idx0]]
diff --git a/clang/test/CodeGenHLSL/BasicFeatures/OutputArguments.hlsl b/clang/test/CodeGenHLSL/BasicFeatures/OutputArguments.hlsl
index 58237889db1dca..6afead4f233660 100644
--- a/clang/test/CodeGenHLSL/BasicFeatures/OutputArguments.hlsl
+++ b/clang/test/CodeGenHLSL/BasicFeatures/OutputArguments.hlsl
@@ -260,10 +260,10 @@ void order_matters(inout int X, inout int Y) {
 // CHECK: store i32 [[VVal]], ptr [[Tmp0]]
 // CHECK: [[VVal:%.*]] = load i32, ptr [[V]]
 // CHECK: store i32 [[VVal]], ptr [[Tmp1]]
-// CHECK: call void {{.*}}order_matters{{.*}}(ptr noalias noundef nonnull align 4 dereferenceable(4) [[Tmp1]], ptr noalias noundef nonnull align 4 dereferenceable(4) [[Tmp0]])
-// CHECK: [[Arg1Val:%.*]] = load i32, ptr [[Tmp1]]
+// CHECK: call void {{.*}}order_matters{{.*}}(ptr noalias noundef nonnull align 4 dereferenceable(4) [[Tmp0]], ptr noalias noundef nonnull align 4 dereferenceable(4) [[Tmp1]])
+// CHECK: [[Arg1Val:%.*]] = load i32, ptr [[Tmp0]]
 // CHECK: store i32 [[Arg1Val]], ptr [[V]]
-// CHECK: [[Arg2Val:%.*]] = load i32, ptr [[Tmp0]]
+// CHECK: [[Arg2Val:%.*]] = load i32, ptr [[Tmp1]]
 // CHECK: store i32 [[Arg2Val]], ptr [[V]]
 
 // OPT: ret i32 2
@@ -289,17 +289,19 @@ void setFour(inout int I) {
 // CHECK: [[B:%.*]] = alloca %struct.B
 // CHECK: [[Tmp:%.*]] = alloca i32
 
-// CHECK: [[BFLoad:%.*]] = load i32, ptr [[B]]
-// CHECK: [[BFshl:%.*]] = shl i32 [[BFLoad]], 24
-// CHECK: [[BFashr:%.*]] = ashr i32 [[BFshl]], 24
-// CHECK: store i32 [[BFashr]], ptr [[Tmp]]
+// CHECK: [[BFLoad:%.*]] = load i16, ptr [[B]]
+// CHECK: [[BFshl:%.*]] = shl i16 [[BFLoad]], 8
+// CHECK: [[BFashr:%.*]] = ashr i16 [[BFshl]], 8
+// CHECK: [[BFcast:%.*]] = sext i16 [[BFashr]] to i32
+// CHECK: store i32 [[BFcast]], ptr [[Tmp]]
 // CHECK: call void {{.*}}setFour{{.*}}(ptr noalias noundef nonnull align 4 dereferenceable(4) [[Tmp]])
 // CHECK: [[RetVal:%.*]] = load i32, ptr [[Tmp]]
-// CHECK: [[BFLoad:%.*]] = load i32, ptr [[B]]
-// CHECK: [[BFValue:%.*]] = and i32 [[RetVal]], 255
-// CHECK: [[ZerodField:%.*]] = and i32 [[BFLoad]], -256
-// CHECK: [[BFSet:%.*]] = or i32 [[ZerodField]], [[BFValue]]
-// CHECK: store i32 [[BFSet]], ptr [[B]]
+// CHECK: [[TruncVal:%.*]] = trunc i32 [[RetVal]] to i16
+// CHECK: [[BFLoad:%.*]] = load i16, ptr [[B]]
+// CHECK: [[BFValue:%.*]] = and i16 [[TruncVal]], 255
+// CHECK: [[ZerodField:%.*]] = and i16 [[BFLoad]], -256
+// CHECK: [[BFSet:%.*]] = or i16 [[ZerodField]], [[BFValue]]
+// CHECK: store i16 [[BFSet]], ptr [[B]]
 
 // OPT: ret i32 8
 export int case11() {
diff --git a/clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl b/clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl
index b39311ad67cd62..c0eb1b138ed047 100644
--- a/clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl
+++ b/clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl
@@ -25,11 +25,11 @@ void main(unsigned GI : SV_GroupIndex) {}
 // CHECK: define void @main()
 // CHECK-NEXT: entry:
 // Verify function constructors are emitted
-// NOINLINE-NEXT:   call void @"?call_me_first@@YAXXZ"()
-// NOINLINE-NEXT:   call void @"?then_call_me@@YAXXZ"()
+// NOINLINE-NEXT:   call void @_Z13call_me_firstv()
+// NOINLINE-NEXT:   call void @_Z12then_call_mev()
 // NOINLINE-NEXT:   %0 = call i32 @llvm.dx.flattened.thread.id.in.group()
-// NOINLINE-NEXT:   call void @"?main@@YAXI@Z"(i32 %0)
-// NOINLINE-NEXT:   call void @"?call_me_last@@YAXXZ"(
+// NOINLINE-NEXT:   call void @_Z4mainj(i32 %0)
+// NOINLINE-NEXT:   call void @_Z12call_me_lastv(
 // NOINLINE-NEXT:   ret void
 
 // Verify constructor calls are inlined when AlwaysInline is run
diff --git a/clang/test/CodeGenHLSL/GlobalConstructorLib.hlsl b/clang/test/CodeGenHLSL/GlobalConstructorLib.hlsl
index 78f6475462bc47..09c44f6242c53c 100644
--- a/clang/test/CodeGenHLSL/GlobalConstructorLib.hlsl
+++ b/clang/test/CodeGenHLSL/GlobalConstructorLib.hlsl
@@ -13,7 +13,7 @@ void FirstEntry() {}
 // CHECK: define void @FirstEntry()
 // CHECK-NEXT: entry:
 // NOINLINE-NEXT:   call void @_GLOBAL__sub_I_GlobalConstructorLib.hlsl()
-// NOINLINE-NEXT:   call void @"?FirstEntry@@YAXXZ"()
+// NOINLINE-NEXT:   call void @_Z10FirstEntryv()
 // Verify inlining leaves only calls to "llvm." intrinsics
 // INLINE-NOT:   call {{[^@]*}} @{{[^l][^l][^v][^m][^\.]}}
 // CHECK: ret void
@@ -25,7 +25,7 @@ void SecondEntry() {}
 // CHECK: define void @SecondEntry()
 // CHECK-NEXT: entry:
 // NOINLINE-NEXT:   call void @_GLOBAL__sub_I_GlobalConstructorLib.hlsl()
-// NOINLINE-NEXT:   call void @"?SecondEntry@@YAXXZ"()
+// NOINLINE-NEXT:   call void @_Z11SecondEntryv()
 // Verify inlining leaves only calls to "llvm." intrinsics
 // INLINE-NOT:   call {{[^@]*}} @{{[^l][^l][^v][^m][^\.]}}
 // CHECK: ret void
@@ -33,6 +33,10 @@ void SecondEntry() {}
 
 // Verify the constructor is alwaysinline
 // NOINLINE: ; Function Attrs: {{.*}}alwaysinline
-// NOINLINE-NEXT: define internal void @_GLOBAL__sub_I_GlobalConstructorLib.hlsl() [[IntAttr:\#[0-9]+]]
+// NOINLINE-NEXT: define linkonce_odr void @_ZN4hlsl8RWBufferIfEC2Ev({{.*}} [[CtorAttr:\#[0-9]+]]
 
-// NOINLINE: attributes [[IntAttr]] = {{.*}} alwaysinline
+// NOINLINE: ; Function Attrs: {{.*}}alwaysinline
+// NOINLINE-NEXT: define internal void @_GLOBAL__sub_I_GlobalConstructorLib.hlsl() [[InitAttr:\#[0-9]+]]
+
+// NOINLINE-DAG: attributes [[InitAttr]] = {{.*}} alwaysinline
+// NOINLINE-DAG: attributes [[CtorAttr]] = {{.*}} alwaysinline
diff --git a/clang/test/CodeGenHLSL/GlobalConstructors.hlsl b/clang/test/CodeGenHLSL/GlobalConstructors.hlsl
index 7e2f288726c954..7b26dba0d19010 100644
--- a/clang/test/CodeGenHLSL/GlobalConstructors.hlsl
+++ b/clang/test/CodeGenHLSL/GlobalConstructors.hlsl
@@ -12,5 +12,5 @@ void main(unsigned GI : SV_GroupIndex) {}
 //CHECK-NEXT: entry:
 //CHECK-NEXT:   call void @_GLOBAL__sub_I_GlobalConstructors.hlsl()
 //CHECK-NEXT:   %0 = call i32 @llvm.dx.flattened.thread.id.in.group()
-//CHECK-NEXT:   call void @"?main@@YAXI@Z"(i32 %0)
+//CHECK-NEXT:   call void @_Z4mainj(i32 %0)
 //CHECK-NEXT:   ret void
diff --git a/clang/test/CodeGenHLSL/GlobalDestructors.hlsl b/clang/test/CodeGenHLSL/GlobalDestructors.hlsl
index ea28354222f885..f98318601134bb 100644
--- a/clang/test/CodeGenHLSL/GlobalDestructors.hlsl
+++ b/clang/test/CodeGenHLSL/GlobalDestructors.hlsl
@@ -1,7 +1,7 @@
-// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -std=hlsl202x -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s --check-prefixes=CS,NOINLINE,CHECK
-// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -std=hlsl202x -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s --check-prefixes=LIB,NOINLINE,CHECK
-// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -std=hlsl202x -emit-llvm -O0 %s -o - | FileCheck %s --check-prefixes=INLINE,CHECK
-// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -std=hlsl202x -emit-llvm -O0 %s -o - | FileCheck %s --check-prefixes=INLINE,CHECK
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s --check-prefixes=CS,NOINLINE,CHECK
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s --check-prefixes=LIB,NOINLINE,CHECK
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -O0 %s -o - | FileCheck %s --check-prefixes=INLINE,CHECK
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -emit-llvm -O0 %s -o - | FileCheck %s --check-prefixes=INLINE,CHECK
 
 // Tests that constructors and destructors are appropriately generated for globals
 // and that their calls are inlined when AlwaysInline is run
@@ -59,7 +59,7 @@ void main(unsigned GI : SV_GroupIndex) {
 // Verify destructor is emitted
 // NOINLINE-NEXT:   call void @_GLOBAL__sub_I_GlobalDestructors.hlsl()
 // NOINLINE-NEXT:   %0 = call i32 @llvm.dx.flattened.thread.id.in.group()
-// NOINLINE-NEXT:   call void @"?main@@YAXI@Z"(i32 %0)
+// NOINLINE-NEXT:   call void @_Z4mainj(i32 %0)
 // NOINLINE-NEXT:   call void @_GLOBAL__D_a()
 // NOINLINE-NEXT:   ret void
 // Verify inlining leaves only calls to "llvm." intrinsics
@@ -71,8 +71,8 @@ void main(unsigned GI : SV_GroupIndex) {
 
 // NOINLINE: define internal void @_GLOBAL__D_a() [[IntAttr:\#[0-9]+]]
 // NOINLINE-NEXT: entry:
-// NOINLINE-NEXT:   call void @"??1Tail@@QAA@XZ"(ptr @"?T@?1??Wag@@YAXXZ@4UTail@@A")
-// NOINLINE-NEXT:   call void @"??1Pupper@@QAA@XZ"(ptr @"?GlobalPup@@3UPupper@@A")
+// NOINLINE-NEXT:   call void @_ZN4TailD1Ev(ptr @_ZZ3WagvE1T)
+// NOINLINE-NEXT:   call void @_ZN6PupperD1Ev(ptr @GlobalPup)
 // NOINLINE-NEXT:   ret void
 
 // NOINLINE: attributes [[IntAttr]] = {{.*}} alwaysinline
diff --git a/clang/test/CodeGenHLSL/basic_types.hlsl b/clang/test/CodeGenHLSL/basic_types.hlsl
index 15c963dfa666f4..d987af45a649fb 100644
--- a/clang/test/CodeGenHLSL/basic_types.hlsl
+++ b/clang/test/CodeGenHLSL/basic_types.hlsl
@@ -6,38 +6,38 @@
 // RUN:   -emit-llvm -disable-llvm-passes -o - -DNAMESPACED| FileCheck %s
 
 
-// CHECK:"?uint16_t_Val@@3GA" = global i16 0, align 2
-// CHECK:"?int16_t_Val@@3FA" = global i16 0, align 2
-// CHECK:"?uint_Val@@3IA" = global i32 0, align 4
-// CHECK:"?uint64_t_Val@@3KA" = global i64 0, align 8
-// CHECK:"?int64_t_Val@@3JA" = global i64 0, align 8
-// CHECK:"?int16_t2_Val@@3T?$__vector@F$01@__clang@@A" = global <2 x i16> zeroinitializer, align 4
-// CHECK:"?int16_t3_Val@@3T?$__vector@F$02@__clang@@A" = global <3 x i16> zeroinitializer, align 8
-// CHECK:"?int16_t4_Val@@3T?$__vector@F$03@__clang@@A" = global <4 x i16> zeroinitializer, align 8
-// CHECK:"?uint16_t2_Val@@3T?$__vector@G$01@__clang@@A" = global <2 x i16> zeroinitializer, align 4
-// CHECK:"?uint16_t3_Val@@3T?$__vector@G$02@__clang@@A" = global <3 x i16> zeroinitializer, align 8
-// CHECK:"?uint16_t4_Val@@3T?$__vector@G$03@__clang@@A" = global <4 x i16> zeroinitializer, align 8
-// CHECK:"?int2_Val@@3T?$__vector@H$01@__clang@@A" = global <2 x i32> zeroinitializer, align 8
-// CHECK:"?int3_Val@@3T?$__vector@H$02@__clang@@A" = global <3 x i32> zeroinitializer, align 16
-// CHECK:"?int4_Val@@3T?$__vector@H$03@__clang@@A" = global <4 x i32> zeroinitializer, align 16
-// CHECK:"?uint2_Val@@3T?$__vector@I$01@__clang@@A" = global <2 x i32> zeroinitializer, align 8
-// CHECK:"?uint3_Val@@3T?$__vector@I$02@__clang@@A" = global <3 x i32> zeroinitializer, align 16
-// CHECK:"?uint4_Val@@3T?$__vector@I$03@__clang@@A" = global <4 x i32> zeroinitializer, align 16
-// CHECK:"?int64_t2_Val@@3T?$__vector@J$01@__clang@@A" = global <2 x i64> zeroinitializer, align 16
-// CHECK:"?int64_t3_Val@@3T?$__vector@J$02@__clang@@A" = global <3 x i64> zeroinitializer, align 32
-// CHECK:"?int64_t4_Val@@3T?$__vector@J$03@__clang@@A" = global <4 x i64> zeroinitializer, align 32
-// CHECK:"?uint64_t2_Val@@3T?$__vector@K$01@__clang@@A" = global <2 x i64> zeroinitializer, align 16
-// CHECK:"?uint64_t3_Val@@3T?$__vector@K$02@__clang@@A" = global <3 x i64> zeroinitializer, align 32
-// CHECK:"?uint64_t4_Val@@3T?$__vector@K$03@__clang@@A" = global <4 x i64> zeroinitializer, align 32
-// CHECK:"?half2_Val@@3T?$__vector@$f16@$01@__clang@@A" = global <2 x half> zeroinitializer, align 4
-// CHECK:"?half3_Val@@3T?$__vector@$f16@$02@__clang@@A" = global <3 x half> zeroinitializer, align 8
-// CHECK:"?half4_Val@@3T?$__vector@$f16@$03@__clang@@A" = global <4 x half> zeroinitializer, align 8
-// CHECK:"?float2_Val@@3T?$__vector@M$01@__clang@@A" = global <2 x float> zeroinitializer, align 8
-// CHECK:"?float3_Val@@3T?$__vector@M$02@__clang@@A" = global <3 x float> zeroinitializer, align 16
-// CHECK:"?float4_Val@@3T?$__vector@M$03@__clang@@A" = global <4 x float> zeroinitializer, align 16
-// CHECK:"?double2_Val@@3T?$__vector@N$01@__clang@@A" = global <2 x double> zeroinitializer, align 16
-// CHECK:"?double3_Val@@3T?$__vector@N$02@__clang@@A" = global <3 x double> zeroinitializer, align 32
-// CHECK:"?double4_Val@@3T?$__vector@N$03@__clang@@A" = global <4 x double> zeroinitializer, align 32
+// CHECK: @uint16_t_Val = global i16 0, align 2
+// CHECK: @int16_t_Val = global i16 0, align 2
+// CHECK: @uint_Val = global i32 0, align 4
+// CHECK: @uint64_t_Val = global i64 0, align 8
+// CHECK: @int64_t_Val = global i64 0, align 8
+// CHECK: @int16_t2_Val = global <2 x i16> zeroinitializer, align 4
+// CHECK: @int16_t3_Val = global <3 x i16> zeroinitializer, align 8
+// CHECK: @int16_t4_Val = global <4 x i16> zeroinitializer, align 8
+// CHECK: @uint16_t2_Val = global <2 x i16> zeroinitializer, align 4
+// CHECK: @uint16_t3_Val = global <3 x i16> zeroinitializer, align 8
+// CHECK: @uint16_t4_Val = global <4 x i16> zeroinitializer, align 8
+// CHECK: @int2_Val = global <2 x i32> zeroinitializer, align 8
+// CHECK: @int3_Val = global <3 x i32> zeroinitializer, align 16
+// CHECK: @int4_Val = global <4 x i32> zeroinitializer, align 16
+// CHECK: @uint2_Val = global <2 x i32> zeroinitializer, align 8
+// CHECK: @uint3_Val = global <3 x i32> zeroinitializer, align 16
+// CHECK: @uint4_Val = global <4 x i32> zeroinitializer, align 16
+// CHECK: @int64_t2_Val = global <2 x i64> zeroinitializer, align 16
+// CHECK: @int64_t3_Val = global <3 x i64> zeroinitializer, align 32
+// CHECK: @int64_t4_Val = global <4 x i64> zeroinitializer, align 32
+// CHECK: @uint64_t2_Val = global <2 x i64> zeroinitializer, align 16
+// CHECK: @uint64_t3_Val = global <3 x i64> zeroinitializer, align 32
+// CHECK: @uint64_t4_Val = global <4 x i64> zeroinitializer, align 32
+// CHECK: @half2_Val = global <2 x half> zeroinitializer, align 4
+// CHECK: @half3_Val = global <3 x half> zeroinitializer, align 8
+// CHECK: @half4_Val = global <4 x half> zeroinitializer, align 8
+// CHECK: @float2_Val = global <2 x float> zeroinitializer, align 8
+// CHECK: @float3_Val = global <3 x float> zeroinitializer, align 16
+// CHECK: @float4_Val = global <4 x float> zeroinitializer, align 16
+// CHECK: @double2_Val = global <2 x double> zeroinitializer, align 16
+// CHECK: @double3_Val = global <3 x double> zeroinitializer, align 32
+// CHECK: @double4_Val = global <4 x double> zeroinitializer, align 32
 
 #ifdef NAMESPACED
 #define TYPE_DECL(T)  hlsl::T T##_Val
diff --git a/clang/test/CodeGenHLSL/builtins/RWBuffer-annotations.hlsl b/clang/test/CodeGenHLSL/builtins/RWBuffer-annotations.hlsl
index 7ca78e60fb9c59..e1e047485e4df0 100644
--- a/clang/test/CodeGenHLSL/builtins/RWBuffer-annotations.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/RWBuffer-annotations.hlsl
@@ -16,9 +16,9 @@ void main() {
 }
 
 // CHECK: !hlsl.uavs = !{![[Single:[0-9]+]], ![[Array:[0-9]+]], ![[SingleAllocated:[0-9]+]], ![[ArrayAllocated:[0-9]+]], ![[SingleSpace:[0-9]+]], ![[ArraySpace:[0-9]+]]}
-// CHECK-DAG: ![[Single]] = !{ptr @"?Buffer1@@3V?$RWBuffer@M@hlsl@@A", i32 10, i32 9, i1 false, i32 -1, i32 0}
-// CHECK-DAG: ![[Array]] = !{ptr @"?BufferArray@@3PAV?$RWBuffer@T?$__vector@M$03@__clang@@@hlsl@@A", i32 10, i32 9, i1 false, i32 -1, i32 0}
-// CHECK-DAG: ![[SingleAllocated]] = !{ptr @"?Buffer2@@3V?$RWBuffer@M@hlsl@@A", i32 10, i32 9, i1 false, i32 3, i32 0}
-// CHECK-DAG: ![[ArrayAllocated]] = !{ptr @"?BufferArray2@@3PAV?$RWBuffer@T?$__vector@M$03@__clang@@@hlsl@@A", i32 10, i32 9, i1 false, i32 4, i32 0}
-// CHECK-DAG: ![[SingleSpace]] = !{ptr @"?Buffer3@@3V?$RWBuffer@M@hlsl@@A", i32 10, i32 9, i1 false, i32 3, i32 1}
-// CHECK-DAG: ![[ArraySpace]] = !{ptr @"?BufferArray3@@3PAV?$RWBuffer@T?$__vector@M$03@__clang@@@hlsl@@A", i32 10, i32 9, i1 false, i32 4, i32 1}
+// CHECK-DAG: ![[Single]] = !{ptr @Buffer1, i32 10, i32 9, i1 false, i32 -1, i32 0}
+// CHECK-DAG: ![[Array]] = !{ptr @BufferArray, i32 10, i32 9, i1 false, i32 -1, i32 0}
+// CHECK-DAG: ![[SingleAllocated]] = !{ptr @Buffer2, i32 10, i32 9, i1 false, i32 3, i32 0}
+// CHECK-DAG: ![[ArrayAllocated]] = !{ptr @BufferArray2, i32 10, i32 9, i1 false, i32 4, i32 0}
+// CHECK-DAG: ![[SingleSpace]] = !{ptr @Buffer3, i32 10, i32 9, i1 false, i32 3, i32 1}
+// CHECK-DAG: ![[ArraySpace]] = !{ptr @BufferArray3, i32 10, i32 9, i1 false, i32 4, i32 1}
diff --git a/clang/test/CodeGenHLSL/builtins/RWBuffer-elementtype.hlsl b/clang/test/CodeGenHLSL/builtins/RWBuffer-elementtype.hlsl
index 036c9c28ef2779..eca4f1598fd658 100644
--- a/clang/test/CodeGenHLSL/builtins/RWBuffer-elementtype.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/RWBuffer-elementtype.hlsl
@@ -37,16 +37,16 @@ void main(int GI : SV_GroupIndex) {
   BufF32x3[GI] = 0;
 }
 
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufI16@@3V?$RWBuffer@F@hlsl@@A", i32 10, i32 2,
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufU16@@3V?$RWBuffer@G@hlsl@@A", i32 10, i32 3,
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufI32@@3V?$RWBuffer@H@hlsl@@A", i32 10, i32 4,
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufU32@@3V?$RWBuffer@I@hlsl@@A", i32 10, i32 5,
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufI64@@3V?$RWBuffer@J@hlsl@@A", i32 10, i32 6,
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufU64@@3V?$RWBuffer@K@hlsl@@A", i32 10, i32 7,
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufF16@@3V?$RWBuffer@$f16@@hlsl@@A", i32 10, i32 8,
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufF32@@3V?$RWBuffer@M@hlsl@@A", i32 10, i32 9,
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufF64@@3V?$RWBuffer@N@hlsl@@A", i32 10, i32 10,
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufI16x4@@3V?$RWBuffer@T?$__vector@F$03@__clang@@@hlsl@@A", i32 10, i32 2,
-// CHECK: !{{[0-9]+}} = !{ptr @"?BufU32x3@@3V?$R...
[truncated]

@pow2clk pow2clk left a comment

Just a few comments to explain the changes that aren't just mangling style changes.


// CHECK-LABEL: define noundef i64 @_Z6shru64mm(i64 noundef %V, i64 noundef %S) #0 {
// CHECK-DAG: %[[Masked:.*]] = and i64 %{{.*}}, 63
// CHECK-DAG: %{{.*}} = lshr i64 %{{.*}}, %[[Masked]]

This is all incidental; I just thought we should test some logical shifts.
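
A minimal C++ sketch of what those CHECK lines verify: the shift amount is masked to the low six bits before a logical right shift on a 64-bit value. The function name `shru64` is taken from the CHECK-LABEL; the body is an illustration, not the test source.

```cpp
#include <cassert>
#include <cstdint>

// Logical right shift with the amount masked to [0, 63], mirroring
// `and i64 ..., 63` followed by `lshr i64`.
std::uint64_t shru64(std::uint64_t V, std::uint64_t S) {
  const std::uint64_t Masked = S & 63;
  assert(Masked < 64); // the mask keeps the C++ shift well-defined
  return V >> Masked;
}
```

For example, `shru64(8, 65)` evaluates to 4, because `65 & 63` is 1.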

 // CHECK: store i32 [[Arg1Val]], ptr [[V]]
-// CHECK: [[Arg2Val:%.*]] = load i32, ptr [[Tmp0]]
+// CHECK: [[Arg2Val:%.*]] = load i32, ptr [[Tmp1]]

@pow2clk pow2clk Oct 9, 2024

This is where the new copy-in/copy-out order is reflected.
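
To make the ordering concrete, here is a small C++ model of copy-in/copy-out when the same variable is passed for both inout parameters. The first argument uses the first temporary, and the temporaries are written back left to right, so the second parameter's value is the one that survives. The body of `order_matters` here is an assumption for illustration and may not match the test source.

```cpp
#include <cassert>

// Hypothetical callee; with inout parameters it only ever sees temporaries.
void order_matters(int &X, int &Y) {
  Y = 2;
  X = 1;
}

int call_with_same_variable() {
  int V = 0;
  int Tmp0 = V;              // copy-in for the first argument
  int Tmp1 = V;              // copy-in for the second argument
  assert(Tmp0 == Tmp1);      // both temporaries start from the same value
  order_matters(Tmp0, Tmp1); // first argument gets Tmp0, second gets Tmp1
  V = Tmp0;                  // copy-out of the first argument
  V = Tmp1;                  // copy-out of the second argument; last write wins
  return V;
}
```

Under these assumptions `call_with_same_variable()` returns 2.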

// CHECK: [[BFValue:%.*]] = and i16 [[TruncVal]], 255
// CHECK: [[ZerodField:%.*]] = and i16 [[BFLoad]], -256
// CHECK: [[BFSet:%.*]] = or i16 [[ZerodField]], [[BFValue]]
// CHECK: store i16 [[BFSet]], ptr [[B]]

@pow2clk pow2clk Oct 9, 2024

This is an instance of using smaller types for bitfields. Doing so introduced some additional instructions (the trunc and sext).
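
The arithmetic in the new CHECK lines can be sketched in C++ as below. This assumes an 8-bit signed field in the low byte of a 16-bit storage unit, which is what the constants 8, 255, and -256 suggest. `load_field` and `store_field` are hypothetical names.

```cpp
#include <cassert> // only needed for the checks mentioned after the block
#include <cstdint>

// Read: shl 8 and ashr 8 on the 16-bit unit, then sign-extend to 32 bits.
std::int32_t load_field(std::uint16_t storage) {
  const std::uint16_t shl = static_cast<std::uint16_t>(storage << 8);
  // Arithmetic right shift of a possibly negative value: guaranteed since
  // C++20 and what mainstream compilers do in earlier modes.
  const std::int16_t ashr =
      static_cast<std::int16_t>(static_cast<std::int16_t>(shl) >> 8);
  return ashr; // implicit widening plays the role of sext i16 -> i32
}

// Write: trunc to 16 bits, mask the value, clear the field, then merge.
std::uint16_t store_field(std::uint16_t storage, std::int32_t value) {
  const std::uint16_t truncated = static_cast<std::uint16_t>(value); // trunc
  const std::uint16_t field_bits =
      static_cast<std::uint16_t>(truncated & 0x00FF);                // and 255
  const std::uint16_t cleared =
      static_cast<std::uint16_t>(storage & 0xFF00);                  // and -256
  return static_cast<std::uint16_t>(cleared | field_bits);           // or
}
```

For instance, `store_field(0x12FF, 8)` gives `0x1208`, and `load_field(0x12FF)` gives -1 because the low byte is sign-extended.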


using hlsl::abs;

#ifdef __HLSL_ENABLE_16_BIT
// NATIVE_HALF: define noundef i16 @
// NATIVE_HALF-LABEL: define noundef i16 @_Z16test_abs_int16_t

There was a lot of inconsistency in these sorts of checks. Some included more and some less. I tried to include enough to unambiguously identify the function in question.

// CHECK-NEXT: store ptr {{.*}}, ptr [[ThisPtrAddr]]
// CHECK-NEXT: [[ThisPtr:%.*]] = load ptr, ptr [[ThisPtrAddr]]
// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[ThisPtr]], ptr align 4 [[Obj:%.*]], i32 8, i1 false)
// CHECK-NEXT: [[FirstAddr:%.*]] = getelementptr inbounds nuw %struct.Pair, ptr [[ThisPtr]], i32 0, i32 0
// CHECK-NEXT: [[First:%.*]] = load i32, ptr [[FirstAddr]]
// CHECK-NEXT: [[FirstPlusTwo:%.*]] = add nsw i32 [[First]], 2
// CHECK-NEXT: store i32 [[FirstPlusTwo]], ptr [[FirstAddr]]
// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[AggRes]], ptr align 4 [[Obj]], i32 8, i1 false)
// CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr align 4 {{.*}}, ptr align 4 [[Obj]], i32 8, i1 false)

I can't explain exactly why the ResPtr alloca was removed, but it was clearly unnecessary. The wildcard here represents the same location as was previously captured in AggRes and copied into ResPtr, but then ResPtr was never used.

// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s --check-prefixes=CS,NOINLINE,CHECK
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s --check-prefixes=LIB,NOINLINE,CHECK
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -O0 %s -o - | FileCheck %s --check-prefixes=INLINE,CHECK
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -emit-llvm -O0 %s -o - | FileCheck %s --check-prefixes=INLINE,CHECK

hlsl202x is the default now, so this flag is redundant.

// RUN: FileCheck %s --check-prefixes=CHECK,NATIVE_HALF
// RUN: %clang_cc1 -finclude-default-header -triple dxil-pc-shadermodel6.3-library %s \
// RUN: -emit-llvm -disable-llvm-passes -o - | \
// RUN: FileCheck %s --check-prefixes=CHECK,NO_HALF

The .hlsl file extension already has the same effect as the -x hlsl flag, so the flag was dropped.

// CHECK-NEXT: = or i32 [[Tmp2]], 1
// CHECK: init.check:
// CHECK-NEXT: call void @_ZN4hlsl8RWBufferIiEC1Ev
// CHECK-NEXT: store i8 1, ptr @_ZGVZ4mainvE5mybuf

These changes represent a different but equivalent way of guarding the one-time initialization of a static local variable.
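
As a sketch of the guard shape the new CHECK lines describe, the C++ below uses a one-byte guard that is tested before the constructor runs and set to 1 afterwards. This is the simple, non-thread-safe form. All names (`Buffer`, `g_guard`, `get_mybuf`, `construct_buffer`) are hypothetical stand-ins for the static local and its guard variable.

```cpp
#include <cassert> // only needed for the checks mentioned after the block

struct Buffer {
  int handle;
};

static unsigned char g_guard = 0; // one-byte guard, like the i8 store in the test
static Buffer g_mybuf;            // storage for the function-local static
static int g_ctor_calls = 0;      // counts how often the constructor ran

// Hypothetical stand-in for the resource constructor.
void construct_buffer(Buffer &B) {
  B.handle = 7;
  ++g_ctor_calls;
}

Buffer &get_mybuf() {
  if (g_guard == 0) {          // check the guard byte
    construct_buffer(g_mybuf); // run the constructor exactly once
    g_guard = 1;               // mark the variable as initialized
  }
  return g_mybuf;
}
```

Calling `get_mybuf()` any number of times leaves `g_ctor_calls` at 1.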

Development

Successfully merging this pull request may close these issues.

[DirectX] Switch to Itanium ABI
2 participants